This report presents a median polish decomposition of self-rated health (SRH) across six US surveys. The median polish is a non-parametric method that complements the Bayesian HAPC analysis by extracting additive Age, Period, and Cohort effects without distributional assumptions.
1.1 Method
Compute survey-weighted mean SRH in 5-year age group \(\times\) 5-year period cells
Construct three 2D matrices: AP (age \(\times\) period), AC (age \(\times\) cohort), PC (period \(\times\) cohort)
Apply medpolish() to each matrix to extract row effects, column effects, and residuals
The residuals from each slice reveal the effect of the third dimension (e.g., AP residuals \(\to\) cohort effects)
1.2 Surveys
SRH is coded higher = better in all surveys.
Survey
SRH Scale
Years
Design
GSS
1–4
1972–2024
Weights only
NHANES
1–5
1999–2022
Full (strata + PSU + weights)
MEPS
1–5
2000–2023
Full (strata + PSU + weights)
NHIS
1–5
1983–2024
Full (strata + PSU + weights)
CPS
1–5
1996–2025
Weights only
BRFSS
1–5
1993–2025
Weights only (computational)
Show code
# Load all saved results from the pipelineall_results <-list()all_effects <-tibble()all_direct <-tibble()all_lm <-tibble()all_var <-tibble()for (sv in surveys) { r <-readRDS(file.path(mp_output_dir, paste0("medpolish_full_", sv, ".rds"))) all_results[[sv]] <- r all_effects <-bind_rows(all_effects, r$effects |>mutate(survey =toupper(sv))) all_direct <-bind_rows(all_direct, r$direct_effects |>mutate(survey =toupper(sv))) all_lm <-bind_rows(all_lm, r$lm_summary |>mutate(survey =toupper(sv))) all_var <-bind_rows(all_var, r$var_decomp |>mutate(survey =toupper(sv)))}sv_levels <-toupper(surveys)all_effects <- all_effects |>mutate(survey =factor(survey, levels = sv_levels))all_direct <- all_direct |>mutate(survey =factor(survey, levels = sv_levels))
2 Combined Results
2.1 Mean-Residual APC Effects
The combined figure shows mean residuals from the two-way median polish, averaged across cells for each APC dimension. Points represent the estimated effect; lines show linear trend fits.
Figure 1: Median polish APC effects across all 6 surveys. Columns: Age (blue), Period (green), Cohort (pink). Each point is the mean residual for that dimension value.
2.2 Direct Medpolish Effects
The direct effects are the row and column effects extracted directly by medpolish(). Each dimension appears in two slices (e.g., Age is a row effect in both the AP and AC matrices), providing two independent estimates. Agreement between slices indicates a robust effect; divergence indicates sensitivity to the conditioning dimension.
Figure 2: Direct row/column effects from medpolish. Each dimension has two estimates from different matrix slices (solid vs dashed lines). Close agreement = robust effect.
2.3 Residual Diagnostics
Cell-level residuals from each matrix slice, plotted against the third dimension. Systematic patterns in residuals indicate effects not captured by the additive row + column decomposition.
Figure 3: Median polish residuals across surveys. Gray dots = individual cell residuals; colored line = mean residual trend. Structured patterns indicate real effects of the third dimension.
3 Variance Decomposition
For each matrix slice, the table shows the percentage of total cell-mean variation (sum of squares) explained by row effects, column effects, and residuals.
Note
Because median polish uses medians (not means), the decomposition is not orthogonal — component percentages may not sum to exactly 100%.
AP Matrix: Age vs Period
Show code
all_var |>filter(slice =="AP", component !="Total") |>select(Survey = survey, Component = component, `SS`= ss, `% of Total`= pct) |>mutate(`SS`=round(`SS`, 3),`% of Total`=round(`% of Total`, 1)) |>pivot_wider(names_from = Component, values_from =c(`SS`, `% of Total`)) |>select(Survey,`Age SS`=`SS_Age`, `Age %`=`% of Total_Age`,`Period SS`=`SS_Period`, `Period %`=`% of Total_Period`,`Residual SS`=`SS_Residual`, `Residual %`=`% of Total_Residual`) |>kable(digits =1, caption ="AP matrix: Age dominates Period in every survey.") |>kable_styling(bootstrap_options =c("striped", "hover", "condensed"),full_width =FALSE) |>column_spec(2:3, bold =TRUE)
Table 1: AP matrix: Age dominates Period in every survey.
Survey
Age SS
Age %
Period SS
Period %
Residual SS
Residual %
GSS
6.7
84.6
0.3
3.3
1.7
21.4
NHANES
1.7
73.6
0.1
4.9
0.4
15.4
MEPS
4.9
93.1
0.1
1.9
0.1
2.8
NHIS
15.6
104.4
0.2
1.3
0.5
3.5
CPS
14.5
95.7
0.0
0.3
0.5
3.1
BRFSS
4.6
80.6
0.8
14.4
0.6
10.6
AC Matrix: Age vs Cohort
Show code
all_var |>filter(slice =="AC", component !="Total") |>select(Survey = survey, Component = component, `SS`= ss, `% of Total`= pct) |>mutate(`SS`=round(`SS`, 3),`% of Total`=round(`% of Total`, 1)) |>pivot_wider(names_from = Component, values_from =c(`SS`, `% of Total`)) |>select(Survey,`Age SS`=`SS_Age`, `Age %`=`% of Total_Age`,`Cohort SS`=`SS_Cohort`, `Cohort %`=`% of Total_Cohort`,`Residual SS`=`SS_Residual`, `Residual %`=`% of Total_Residual`) |>kable(digits =1, caption ="AC matrix: Age dominates Cohort, though Cohort contributes modestly (4-21%).") |>kable_styling(bootstrap_options =c("striped", "hover", "condensed"),full_width =FALSE) |>column_spec(2:3, bold =TRUE)
Table 2: AC matrix: Age dominates Cohort, though Cohort contributes modestly (4-21%).
Survey
Age SS
Age %
Cohort SS
Cohort %
Residual SS
Residual %
GSS
6.4
76.8
1.6
19.0
0.6
7.2
NHANES
1.8
77.5
0.2
7.5
0.3
10.9
MEPS
3.8
71.9
0.4
7.5
0.1
1.7
NHIS
16.3
103.8
0.6
4.0
0.2
1.1
CPS
12.5
82.8
0.7
4.6
0.1
0.4
BRFSS
8.4
143.4
1.2
21.2
0.2
4.1
PC Matrix: Period vs Cohort
Show code
all_var |>filter(slice =="PC", component !="Total") |>select(Survey = survey, Component = component, `SS`= ss, `% of Total`= pct) |>mutate(`SS`=round(`SS`, 3),`% of Total`=round(`% of Total`, 1)) |>pivot_wider(names_from = Component, values_from =c(`SS`, `% of Total`)) |>select(Survey,`Period SS`=`SS_Period`, `Period %`=`% of Total_Period`,`Cohort SS`=`SS_Cohort`, `Cohort %`=`% of Total_Cohort`,`Residual SS`=`SS_Residual`, `Residual %`=`% of Total_Residual`) |>kable(digits =1, caption ="PC matrix: Cohort dominates Period once age is removed.") |>kable_styling(bootstrap_options =c("striped", "hover", "condensed"),full_width =FALSE) |>column_spec(4:5, bold =TRUE)
Table 3: PC matrix: Cohort dominates Period once age is removed.
Survey
Period SS
Period %
Cohort SS
Cohort %
Residual SS
Residual %
GSS
4.3
41.3
12.9
124.3
0.6
6.0
NHANES
0.2
7.6
1.9
78.1
0.3
14.3
MEPS
0.2
3.1
5.4
104.0
0.1
2.0
NHIS
6.5
40.2
21.6
134.0
0.4
2.4
CPS
2.8
18.6
18.4
122.1
0.1
0.9
BRFSS
4.5
75.4
7.6
128.6
0.2
3.9
4 Linear Trend Tests
Linear models fit to the mean-residual effects test whether each APC dimension has a significant linear trend. Slopes represent the change in residual effect (SRH units) per unit increase in the dimension.
# Build a concise summary tablesummary_tbl <- all_var |>filter(component !="Total") |>select(survey, slice, component, pct) |>mutate(pct =round(pct, 1)) |>unite("slice_comp", slice, component, sep =": ") |>pivot_wider(names_from = slice_comp, values_from = pct)kable(summary_tbl,caption ="Variance decomposition summary (% of total SS) across all slices and surveys.") |>kable_styling(bootstrap_options =c("striped", "hover", "condensed"),full_width =TRUE, font_size =12) |>add_header_above(c(" "=1, "AP (Age vs Period)"=3,"AC (Age vs Cohort)"=3,"PC (Period vs Cohort)"=3))
Table 5: Variance decomposition summary (% of total SS) across all slices and surveys.
AP (Age vs Period)
AC (Age vs Cohort)
PC (Period vs Cohort)
survey
AP: Age
AP: Period
AP: Residual
AC: Age
AC: Cohort
AC: Residual
PC: Period
PC: Cohort
PC: Residual
GSS
84.6
3.3
21.4
76.8
19.0
7.2
41.3
124.3
6.0
NHANES
73.6
4.9
15.4
77.5
7.5
10.9
7.6
78.1
14.3
MEPS
93.1
1.9
2.8
71.9
7.5
1.7
3.1
104.0
2.0
NHIS
104.4
1.3
3.5
103.8
4.0
1.1
40.2
134.0
2.4
CPS
95.7
0.3
3.1
82.8
4.6
0.4
18.6
122.1
0.9
BRFSS
80.6
14.4
10.6
143.4
21.2
4.1
75.4
128.6
3.9
6.1 Key Findings
Age dominates SRH variation in every survey and every matrix slice. In the AP matrix, Age explains 74–104% of total SS versus 0.3–14% for Period.
Cohort effects are secondary but real. In the AC matrix, Cohort explains 4–21% of variation. In the PC matrix (after removing age), Cohort dominates Period (78–134% vs 3–75%).
Period effects are minimal. Period is consistently the weakest dimension, explaining <5% in the AP matrix for most surveys (except BRFSS at 14%).
Linear trends in the residual effects are mostly non-significant. Only NHANES and BRFSS show significant positive cohort slopes, and MEPS shows a significant positive period slope.
Direct effects agree across slices for Age (robust), but diverge for Period and Cohort (reflecting the fundamental APC identification problem).
These results are consistent with the parametric BHAPC analysis: SRH is primarily structured by age, with modest birth-cohort variation and negligible calendar-time trends.